Evaluating and Optimizing Error-Correcting Codes using a Renormalization Group Transformation

Field of the Invention 

The present invention relates generally to the field of error-correcting codes for 
data storage and data transmission, and more particularly to evaluating and 
optimizing error-correcting codes for intermediate length data blocks. 

Background of the Invention 

A fundamental problem in the field of data storage and communication is 
generating optimal or near optimal error-correcting codes (ECC) for data of a 
given block-length and transmission rate that can also be practically decoded. This 
problem is now nearly solved for small block-lengths, e.g., blocks of length N <
100 bits, and for very large block-lengths, e.g., N > 10^6 bits. However,
error-correcting codes that are used in many applications, for example, wireless
communication, typically have block-lengths in an intermediate range, around N =
2000 bits. Generating optimal codes for these block-lengths remains a problem.

A large number of error-correcting codes are known for small block-lengths, many 
of which are known to be optimal, or near optimal. As long as the block-length is 
small enough, these ECC can be decoded practically and optimally using 
maximum-likelihood decoders. 

The problem of finding optimal codes for very large block-lengths has been
essentially solved by parity-check codes defined by generalized parity check
matrices. These types of codes were first described by R. G. Gallager, in
"Low-density parity check codes," Vol. 21, Research Monograph Series, MIT Press,
1963, but were not properly appreciated until recently. More recently, improved
codes defined by sparse generalized parity check matrices have been described,
such as turbocodes, irregular low-density parity check (LDPC) codes, Kanter-Saad
codes, repeat-accumulate codes, and irregular repeat-accumulate codes.

These improved codes have three particularly noteworthy advantages. First, the
codes can be decoded efficiently using belief propagation (BP) iterative decoding.
Second, the performance of these codes can often be theoretically analyzed using a
density evolution method, in an infinite-block-length limit. Third, by using a
density evolution method, it can be demonstrated that these codes are nearly
optimal codes. In the infinite-block-length limit, BP decoding of these codes
decodes all data blocks that have a noise level below some threshold level, and that
threshold level is often not far from the Shannon limit.

The preferred prior art way for generating improved codes has been to optimize
codes for the infinite block-length limit using density evolution, and hope that a
scaled-down version still results in a near optimal code. The problem with this
method is that for N < 10^4, at least, the block-length is still noticeably far from the
infinite-block-length limit. In particular, many decoding failures are found at noise
levels far below the threshold level predicted by infinite block-length calculations.
Furthermore, there may not necessarily even exist a way to scale down the codes
derived from the density evolution method.

For example, the best known irregular LDPC codes, at a given rate in the N → ∞
limit, often have variable nodes that should participate in hundreds or even
thousands of parity checks, which makes no sense when the overall number of
parity checks is 100 or less.



Density Evolution Method for a Binary Erasure Channel (BEC) 

The density evolution method is simple for a binary erasure channel. A binary
erasure channel is a binary input channel with three output symbols: 0, 1, and an
erasure, which can be represented by a question mark "?". Because this method is
important background for the method according to the invention, it is described
in greater detail below.

Parity Check Codes

Linear block binary error-correcting codes can be defined in terms of a parity
check matrix. In a parity check matrix A, the columns represent transmitted
variable bits, and the rows define linear constraints or checks between the variable
bits. More specifically, the matrix A defines a set of valid vectors or codewords z,
such that each component of z is either 0 or 1, and

    A z = 0,    (1)

where all multiplication and addition are modulo 2.

If the parity check matrix has N columns and N - k rows, then the parity check
matrix defines an error-correcting code of block-length N and transmission rate k/N,
unless some of the rows are linearly dependent, in which case some of the parity
checks are redundant, and the code actually has a higher transmission rate.

As shown in Figure 1, there is a corresponding Tanner graph for each parity check
matrix, see R. M. Tanner, "A recursive approach to low complexity codes,"
IEEE Trans. Info. Theory, IT-27, pages 533-547, 1981. The Tanner graph 100 is a
bipartite graph with two types of nodes: variable nodes i denoted by circles, and
check nodes a denoted by squares. In the Tanner graph, each variable node i is
connected to every check node a whose parity check involves that variable, i.e.,
wherever the entry of A in row a and column i is 1.
For example, the parity check matrix

    A = \begin{pmatrix}
        1 & 1 & 0 & 1 & 0 & 0 \\
        1 & 0 & 1 & 0 & 1 & 0 \\
        0 & 1 & 1 & 0 & 0 & 1
        \end{pmatrix}    (2)

is represented by the bipartite Tanner graph shown in Figure 1.

It should be understood that in practical applications the graphs typically include
thousands of nodes connected in any number of different ways, and containing
many loops. Analyzing such graphs to determine optimal configurations is
difficult.

Error-correcting codes defined by parity check matrices are linear. This means that
the modulo-2 sum of any two codewords is itself a codeword. A parity check matrix
with N columns and N - k linearly independent rows defines 2^k codewords, each of
length N. For the example given above, the codewords are 000000, 001011, 010101,
011110, 100110, 101101, 110011, and 111000. Because of the linearity property, any
one of the codewords is representative. For the purposes of analyzing a code, it is
therefore normally assumed that the all-zeros codeword is transmitted.
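
The definition A z = 0 (mod 2) can be checked directly for a code this small. The following is a minimal sketch, not part of the described method: it assumes the matrix of equation (2), and the helper name codewords is hypothetical. It enumerates all 2^N binary vectors, so it is practical only for very short blocks.

```python
from itertools import product

# Parity check matrix of equation (2): rows are checks, columns are bits.
A = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]


def codewords(parity_check):
    """Return every binary vector z with A z = 0 (mod 2), as bit strings."""
    n = len(parity_check[0])
    found = []
    for z in product((0, 1), repeat=n):
        # z is a codeword when every row of A has even overlap with z.
        if all(sum(h * bit for h, bit in zip(row, z)) % 2 == 0
               for row in parity_check):
            found.append("".join(map(str, z)))
    return found


words = codewords(A)
print(len(words), words)
```

With three linearly independent rows and N = 6, the enumeration finds 2^3 = 8 codewords, and the bitwise modulo-2 sum of any two of them is again in the list.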

Belief Propagation Decoding in the BEC 

An input bit passes through the binary erasure channel as an erasure with 
probability x and is received correctly with probability 1 - x. It is important to note 
that the BEC never flips bits from 0 to 1, or vice versa. If all-zeros codewords are 
transmitted, the received word must consist entirely of zeros and erasures. 

The receiver uses a belief propagation (BP) decoder to decode the input bits by
passing discrete messages between the nodes of the Tanner graph. A message m_ia
is sent from each variable node i to each check node a connected to it. The
message represents the state of the variable node i. In general, the message can be
in one of three states: 1, 0, or ?, but because the all-zeros codeword is always
transmitted, the possibility that m_ia has value 1 can be ignored.

Similarly, there is a message m_ai sent from each check node a to all the variable
nodes i connected to the check node. These messages are interpreted as directives
from the check node a to the variable node i about what state the variable node
should be in. This message is based on the states of the other variable nodes
connected to the check node. The check-to-bit messages can, in principle, take on
the values 0, 1, or ?, but again only the two messages 0 and ? are relevant when the
all-zeros codeword is transmitted.

In the BP decoding process for the BEC, a message m_ia from a variable node i to a
check node a is equal to any non-erasure message that node i has received, either
from the channel or from another check node, because such messages are always
correct in the BEC, or to an erasure when all incoming messages are
erasures. A message m_ai from a check node a to a variable node i is an erasure
when any incoming message from another node participating in the check is an
erasure; otherwise it takes on the value of the binary sum of all incoming messages
from other nodes participating in the check.

BP decoding is iterative. The messages are initialized so that all variable nodes that
are not erased by the channel send out messages equal to the corresponding
received bit, and all other messages are initially erasures. Iterating the BP message
process eventually converges to stationary messages; convergence of BP
decoding is guaranteed for the particularly simple BEC, though not for other
channels. The final decoded value of any erased variable node is just the value of
any non-erasure message coming into that node, unless there is no incoming
non-erasure message. In this case, the BP decoding process terminates and fails to
decode the particular variable node.
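
The decoding rule can be illustrated with a short sketch. It is an assumption-laden simplification rather than the decoder of any particular implementation: instead of storing messages on the edges of the graph, it repeatedly fills in any erased bit that is the only erased bit in some check, which on the BEC leaves the same bits decoded as the message-passing description. The function name bp_decode_bec and the use of None for an erasure are hypothetical choices, and the matrix is the one of equation (2).

```python
# Parity check matrix of equation (2); None marks an erased bit.
A = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]


def bp_decode_bec(parity_check, received):
    """Fill in erasures until no check can resolve another bit."""
    word = list(received)
    progress = True
    while progress:
        progress = False
        for row in parity_check:
            members = [i for i, h in enumerate(row) if h]
            erased = [i for i in members if word[i] is None]
            if len(erased) == 1:
                # The check sends a non-erasure: the binary sum of the
                # other (known) bits participating in the check.
                known_sum = sum(word[i] for i in members if word[i] is not None)
                word[erased[0]] = known_sum % 2
                progress = True
    return word


decoded = bp_decode_bec(A, [0, None, 0, None, None, 0])
stuck = bp_decode_bec(A, [0, 0, None, 0, None, None])
print(decoded, stuck)
```

In the second call the erased bits 3, 5, and 6 each share every one of their checks with another erased bit, so no check ever sends a non-erasure message and decoding of those bits fails.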

Density Evolution

The average probability of failure for BP decoding over many blocks is now
considered. A real number p_ia, which represents the probability that the message
m_ia is an erasure, is associated with each message m_ia. Similarly, a real number q_ai,
which represents the probability that the message m_ai is an erasure, is associated
with each message m_ai. In the density evolution method, probabilities p_ia and q_ai are
determined in a way that is exact, as long as the Tanner graph representing the
error-correcting code has no loops.



The equation for p_ia is

    p_{ia} = x \prod_{b \in N(i) \setminus a} q_{bi},    (3)

where b ∈ N(i)\a represents all check nodes directly connected to variable node i,
except for check node a. This equation can be derived from the
fact that for a message m_ia to be an erasure, the variable node i must be erased in
transmission, and all incoming messages from other checks must be erasures as well. Of
course, if the incoming messages are correlated, then this equation is not correct.
However, in a Tanner graph with no loops, each incoming message is independent
of all other messages.

Similarly, the equation

    q_{ai} = 1 - \prod_{j \in N(a) \setminus i} (1 - p_{ja})    (4)

can be derived from the fact that a message m_ai can only be in a 0 or 1 state when
all incoming messages are in either a zero or one state.

The density evolution equations (3) and (4) can be solved by iteration. A good
initialization is p_ia = x for all messages from variable nodes to check nodes and
q_ai = 0 for all messages from check nodes to variable nodes, as long as the iteration
begins with the q_ai messages. The BEC density evolution equations ultimately
converge. This can be guaranteed for codes defined on graphs without loops. It is
possible to determine b_i, which is the probability of a failure to decode at variable
node i, from the formula

    b_i = x \prod_{a \in N(i)} q_{ai}.    (5)
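
The iteration of equations (3), (4), and (5) can be sketched as follows. This is an illustrative sketch only: the function name density_evolution, the representation of the graph as a list of checks (each a list of zero-based variable indices), and the fixed number of sweeps are all assumptions, and the two-check graph in the usage lines is a hypothetical loop-free example.

```python
from math import prod


def density_evolution(checks, num_vars, x, sweeps=50):
    """Iterate equations (3) and (4), then evaluate (5) for every bit."""
    var_checks = {i: [a for a, members in enumerate(checks) if i in members]
                  for i in range(num_vars)}
    # Initialization from the text: p_ia = x and q_ai = 0.
    p = {(i, a): x for a, members in enumerate(checks) for i in members}
    q = {(a, i): 0.0 for (i, a) in p}
    for _ in range(sweeps):
        # Equation (4): begin each sweep with the check-to-variable values.
        q = {(a, i): 1.0 - prod(1.0 - p[(j, a)] for j in checks[a] if j != i)
             for (a, i) in q}
        # Equation (3): variable-to-check values.
        p = {(i, a): x * prod(q[(b, i)] for b in var_checks[i] if b != a)
             for (i, a) in p}
    # Equation (5): probability that bit i is still erased after decoding.
    b = [x * prod(q[(a, i)] for a in var_checks[i]) for i in range(num_vars)]
    return p, q, b


# Hypothetical loop-free graph: check 0 joins bits 0 and 1,
# check 1 joins bits 1, 2, and 3.
p, q, b = density_evolution([[0, 1], [1, 2, 3]], num_vars=4, x=0.5)
print(b)
```

On a graph without loops the values stop changing after a few sweeps, so the fixed sweep count is generous here; a convergence test could replace it.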



Exact Solution of a Small Code 

As stated above, the density evolution equations (3, 4, and 5) are exact when the
code has a Tanner graph representation without loops.

Consider the error-correcting code defined by a parity check matrix

    A = \begin{pmatrix}
        1 & 1 & 0 & 0 \\
        0 & 1 & 1 & 1
        \end{pmatrix}    (6)

and represented by a corresponding Tanner graph shown in Figure 2. This code has
four codewords: 0000, 0011, 1101, and 1110. If the 0000 message is transmitted,
then there are sixteen possible received messages: 0000, 000?, 00?0, 00??, 0?00,
and so on. The probability of receiving a message with n_e erasures is
x^{n_e} (1 - x)^{4 - n_e}. Messages might be partially or completely decoded by the
BP decoder; for example the received message ?00? is fully decoded to 0000, but the
message 0??? is only partially decoded to 00??, because there is not enough
information to determine whether the transmitted codeword was actually 0000 or 0011.

It is easy to determine the exact probability that a given bit remains an erasure after
decoding by summing over the sixteen possible received messages weighted by
their probabilities. For example, the first bit is only decoded as an erasure when
one of the following messages is received: ???0, ??0?, or ????, so the correct
probability that the first bit is not decoded is 2x^3 (1 - x) + x^4 = 2x^3 - x^4.

If the focus is on the last bit, then the message is decoded unless one of the
following messages is received: 00??, 0???, ?0??, ??0? or ????. Therefore, the overall
probability that the fourth bit is not decoded is x^2 (1 - x)^2 + 3x^3 (1 - x) + x^4 =
x^2 + x^3 - x^4. In the density evolution method, the values for the following variables:

    p_11, p_21, p_22, p_32, p_42, q_11, q_12, q_22, q_23, q_24, b_1, b_2, b_3, b_4

are determined by equations

    p_{11} = x                                (7)
    p_{21} = x q_{22}                         (8)
    p_{22} = x q_{12}                         (9)
    p_{32} = x                                (10)
    p_{42} = x                                (11)
    q_{11} = p_{21}                           (12)
    q_{12} = p_{11}                           (13)
    q_{22} = 1 - (1 - p_{32})(1 - p_{42})     (14)
    q_{23} = 1 - (1 - p_{22})(1 - p_{42})     (15)
    q_{24} = 1 - (1 - p_{22})(1 - p_{32})     (16)
    b_1 = x q_{11}                            (17)
    b_2 = x q_{12} q_{22}                     (18)
    b_3 = x q_{23}                            (19)
    b_4 = x q_{24}                            (20)

Solving these equations yields

    p_{11} = x                                (21)
    p_{21} = 2x^2 - x^3                       (22)
    p_{22} = x^2                              (23)
    p_{32} = x                                (24)
    p_{42} = x                                (25)
    q_{11} = 2x^2 - x^3                       (26)
    q_{12} = x                                (27)
    q_{22} = 2x - x^2                         (28)
    q_{23} = x + x^2 - x^3                    (29)
    q_{24} = x + x^2 - x^3                    (30)

and

    b_1 = 2x^3 - x^4                          (31)
    b_2 = 2x^3 - x^4                          (32)
    b_3 = b_4 = x^2 + x^3 - x^4               (33)

Examining the results for b_1 and b_4 indicates that the density evolution solution
agrees exactly with the direct approach for this code.
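
The direct approach can also be carried out by machine. The sketch below is an illustration under stated assumptions, not part of the method itself: it assumes the four-bit code of equation (6) with zero-based bit indices, tracks only which bits are erased (sufficient because the all-zeros codeword is transmitted), and uses the hypothetical names undecoded_after_bp and exact_failure_rates. It sums the probability of each of the sixteen erasure patterns over the bits that remain erased after decoding.

```python
from itertools import product

# Checks of the matrix in equation (6), with zero-based bit indices.
CHECKS = [[0, 1], [1, 2, 3]]


def undecoded_after_bp(erased):
    """Return the set of bits still erased once BP decoding has stalled."""
    erased = set(erased)
    changed = True
    while changed:
        changed = False
        for members in CHECKS:
            unknown = [i for i in members if i in erased]
            if len(unknown) == 1:
                # A check with a single erased bit resolves that bit.
                erased.discard(unknown[0])
                changed = True
    return erased


def exact_failure_rates(x, num_vars=4):
    """Sum pattern probabilities over all 2^N erasure patterns."""
    rates = [0.0] * num_vars
    for pattern in product((False, True), repeat=num_vars):
        n_e = sum(pattern)
        weight = x ** n_e * (1 - x) ** (num_vars - n_e)
        for i in undecoded_after_bp(i for i, e in enumerate(pattern) if e):
            rates[i] += weight
    return rates


x = 0.3
rates = exact_failure_rates(x)
print(rates)
print(2 * x**3 - x**4, x**2 + x**3 - x**4)
```

For any erasure probability x, the first two entries agree with 2x^3 - x^4 and the last two with x^2 + x^3 - x^4, up to floating-point rounding.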



The Large Block-Length Limit 

If all local neighborhoods in the Tanner graph are identical, the density evolution
equations can be simplified. For example, if each variable node i is connected to d_v
parity checks, and each check node a is connected to d_c variable nodes, then all the
p_ia are equal to the same value p, all the q_ai are equal to the same value q, and all
the b_i are equal to the same value b. Then,

    p = x q^{d_v - 1},    (34)

    q = 1 - (1 - p)^{d_c - 1},    (35)

and

    b = x q^{d_v},    (36)

which are the density evolution equations for (d_v, d_c) regular Gallager codes, valid
in the N → ∞ limit. A regular Gallager code is defined by a sparse random parity check
matrix characterized by the restriction that each row has exactly d_c ones in it, and
each column contains exactly d_v ones.

The intuitive reason that these equations are valid, in the infinite block-length
limit, is that as N → ∞, the size of typical loops in the Tanner graph of a regular
Gallager code goes to infinity, so all incoming messages to a node are independent,
and a regular Gallager code behaves as a code defined on a graph without loops.
Solving equations (34 and 35) for specific values of d_v and d_c yields a solution that
is p = q = b = 0, below a critical erasure limit of x_c. This means that decoding is
perfect. Above x_c, b has a non-zero solution, which corresponds to decoding
failures. The value x_c is easy to determine numerically. For example, if d_v = 3 and
d_c = 5, then x_c ≈ 0.51757.
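
One way to determine the critical erasure limit numerically is bisection on x, iterating the recursion for p and q at each trial value. The sketch below is only one possible approach; the function names, the iteration cap, and the tolerance are assumptions chosen for illustration.

```python
def decodes_perfectly(x, d_v, d_c, iterations=20000, tol=1e-12):
    """Iterate the regular-code recursion; True if p is driven to zero."""
    p = x
    for _ in range(iterations):
        q = 1.0 - (1.0 - p) ** (d_c - 1)   # check-to-variable erasure rate
        p = x * q ** (d_v - 1)             # variable-to-check erasure rate
        if p < tol:
            return True
    return False


def critical_erasure_rate(d_v, d_c, steps=30):
    """Bisect for the largest x at which the iteration still reaches zero."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if decodes_perfectly(mid, d_v, d_c):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_c = critical_erasure_rate(3, 5)
print(round(x_c, 4))
```

Convergence slows sharply just below the threshold, so the finite iteration cap limits the attainable precision; the cap used here is ample for three or four decimal places.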

These density evolution calculations can be generalized to irregular Gallager
codes, or other codes like irregular repeat-accumulate codes which have a finite
number of different classes of nodes with different neighborhoods. In this
generalization, one can derive a system of equations, typically with one equation
for the messages leaving each class of node. By solving the system of equations,
one can again find a critical threshold x_c, below which decoding is perfect. Such
codes can thus be optimized, in the N → ∞ limit, by finding the code that has
maximal noise threshold x_c.

Unfortunately, the density evolution method is erroneous for codes with finite
block-lengths. One might think that it is possible to solve equations (3 and 4) for
any finite code, and hope that ignoring the presence of loops is not too important a
mistake. However, this does not work out, as can be seen by considering regular
Gallager codes. Equations (3, 4, and 5) for a finite block-length regular Gallager
code have exactly the same solutions as one would find in the infinite-block-length
limit, so one would not predict any finite-size effects. However, it is known that
the real performance of finite-block-length regular Gallager codes is considerably
worse than that predicted by such a naive method.



Therefore, there is a need for a method to correctly evaluate finite length
error-correcting codes that does not suffer from the problems of the prior art methods.

Summary of the Invention 

The present invention provides a method for evaluating an error-correcting code
for a data block of a finite size in a binary erasure channel or an additive white
Gaussian noise channel. An error-correcting code is defined by a generalized parity
check matrix, wherein columns represent variable bits and rows represent parity
bits. In a notation of the matrix, "hidden" variable bits which are not transmitted
through the channel are represented by a bar over the corresponding column of the
generalized parity check matrix. The generalized parity check matrix is represented
as a bipartite graph. Single nodes of the bipartite graph are iteratively renormalized,
one at a time, until the number of nodes in the bipartite graph is less than a
predetermined threshold.

During the iterative renormalization, a particular variable node is selected as a
target node, and a distance between the target node and every other node in the
bipartite graph is measured. Then, if there is at least one "leaf" variable node,
a leaf variable node farthest from the target node is renormalized; otherwise, if
there is at least one "leaf" check node, a leaf check node farthest from the target
node is renormalized; and otherwise a variable node farthest from the target node
and having the fewest directly connected check nodes is renormalized. Leaf nodes
are connected to only one other node in the graph.

When the number of nodes in the graph is less than the predetermined threshold,
the decoding failure rate for the target node is determined exactly.

The present invention also provides a method for optimizing error-correcting codes by
searching for the error-correcting code of a specified data block size and
transmission rate with the best performance in terms of decoding failure as a
function of noise. The decoding failure rates for transmitted variable bits are used
to guide the search for an optimal code.

Brief Description of the Drawings

Figure 1 is a prior art bipartite graph representing an error-correcting code
including a loop;

Figure 2 is a prior art bipartite graph representing a simple prior art
error-correcting code;

Figures 3a-e are bipartite graphs renormalized according to the invention;

Figure 4 is a bipartite graph to be renormalized according to the invention;

Figure 5 is a bipartite graph with loops to be renormalized;

Figure 6 shows an expansion of a bipartite graph to be renormalized;

Figure 7 is a bipartite graph with a loop representing a generalized parity check
matrix to be renormalized;

Figure 8 is a graph comparing methods of evaluating error-correcting codes;

Figure 9 is a flow diagram of a renormalization group method according to the
invention; and

Figure 10 is a flow diagram of the method according to the invention for a graph
with loops.

Detailed Description of the Preferred Embodiment

Renormalization Group Method 

Our invention evaluates error-correcting codes by using "real-space" 
renormalization group transformations. Our renormalization group (RG) method is 
adapted from techniques for the analysis of magnetic spin systems described by T. 
Niemeijer and J. M. J. van Leeuwen in "Phase Transitions and Critical 
Phenomena," C. Domb and M. S. Green editors, Vol. 6, Academic Press, London, 
1976. Renormalization groups have never been applied to error-correcting codes 
used to store and transmit data. 

To evaluate the performance of a large but finite error-correcting code, we 
iteratively replace the large code with a slightly smaller code that has the same or 
nearly the same performance, i.e., decoding failure rate. In particular, at each step 
in our iterative method, we keep a Tanner graph and a set of probability variables 
p_ia and q_ai associated with messages transmitted by the nodes of the graph. For the
purpose of describing the present invention, we call the combination of the Tanner 
graph and the p and q variables a "decorated Tanner graph." 

The basis of our RG method is the RG transformation by which we iteratively
eliminate, i.e., "renormalize," single nodes in the decorated Tanner graph, and
adjust the remaining values of the p and q messages so that the smaller
error-correcting code has a decoding failure rate as close as possible to the replaced
code.

Thus, with each renormalization step, the decorated Tanner graph representing our
code shrinks by one node, until the graph is finally small enough that the 
performance of the error-correcting code can be determined in an efficient manner. 
In contrast, prior art density evolution methods never change the number of nodes, 
or the way the nodes are connected to each other. In other words, their graphs are 
static, while our graphs change dynamically during renormalization. When our 
graph is small enough, e.g., when the number of remaining nodes is less than a 
predetermined threshold, the failure rate can readily be determined. In one 
embodiment, we reduce the graph down to a single node. 

Figure 9 shows the steps of the general method 900 according to our invention. For 
each selected "target" variable node i in our decorated Tanner graph for which the 
decoding failure rate b_i is to be determined, repeatedly renormalize a single node
from the graph, other than the target node, until a predetermined threshold 950 is 
reached. The threshold can either be expressed as a desired number of nodes, or a 
desired failure rate for the remaining node. Nodes are renormalized as follows. 

Measure 910 the "distances" between every other node and the "target" node i. The
distance between two nodes is the minimal number of nodes through which one 
passes to travel from one node to the other, i.e., the number of intervening nodes. 

If there are any "leaf" variable nodes, then renormalize 920 a leaf variable node
farthest from the "target" node. A node is a leaf node when it is connected to only
one other node in the graph. Ties in distance can be broken randomly.

Otherwise, if there are no "leaf" variable nodes, then renormalize 930 a "leaf"
check node farthest from the "target" node. Again, ties can be broken randomly.

Otherwise, if there are no "leaf" check nodes, renormalize 940 a variable node from
among those farthest from the target node that is directly connected to the fewest
number of check nodes. Again, ties can be broken randomly.

The renormalization steps are iteratively repeated until the graph has been reduced 
down to the desired number of nodes. These steps are described in
greater detail below.
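
The selection order of steps 910 through 940 can be sketched as follows. This is a sketch under stated assumptions rather than a definitive implementation: nodes are hypothetical tuples such as ("v", 2) for a variable node and ("c", 1) for a check node, the graph is an adjacency dictionary, distance is counted in hops (which orders nodes the same way as the number of intervening nodes), and ties are broken by dictionary order instead of randomly so that the result is reproducible.

```python
from collections import deque


def hop_distances(adjacency, target):
    """Breadth-first search: number of hops from the target to each node."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def next_node_to_renormalize(adjacency, target):
    """Pick the next node: leaf variable, else leaf check, else variable."""
    dist = hop_distances(adjacency, target)
    others = [n for n in adjacency if n != target]

    def far(n):
        return dist.get(n, float("inf"))

    leaf_vars = [n for n in others if n[0] == "v" and len(adjacency[n]) == 1]
    if leaf_vars:
        return max(leaf_vars, key=far)            # step 920
    leaf_checks = [n for n in others if n[0] == "c" and len(adjacency[n]) == 1]
    if leaf_checks:
        return max(leaf_checks, key=far)          # step 930
    variables = [n for n in others if n[0] == "v"]
    farthest = max(far(n) for n in variables)
    candidates = [n for n in variables if far(n) == farthest]
    return min(candidates, key=lambda n: len(adjacency[n]))   # step 940


# Hypothetical graph: check 1 joins variables 1 and 2,
# check 2 joins variables 2, 3, and 4.
tree = {
    ("v", 1): [("c", 1)],
    ("v", 2): [("c", 1), ("c", 2)],
    ("v", 3): [("c", 2)],
    ("v", 4): [("c", 2)],
    ("c", 1): [("v", 1), ("v", 2)],
    ("c", 2): [("v", 2), ("v", 3), ("v", 4)],
}
print(next_node_to_renormalize(tree, ("v", 2)))
```

In this graph all three leaf variable nodes are equally far from the target, so any of them is a valid choice; a random tie-break could be substituted where the text calls for one.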

The above steps can be applied to as many target nodes as desired; for example, 
the average failure rate of every node in the graph can be determined. In a practical 
application, the renormalization is applied to target nodes representative of groups 
of like nodes. Then, it can be determined whether the group of nodes is "strong" or "weak"
with respect to its error-correcting capabilities. This information can then be used 
to improve the overall performance of the error-correcting code. For example, 
nodes in weak groups can be connected to more parity checks. Thus, the invention 
provides means for improving error-correcting codes in a structured manner. 

The RG Transformation for Decorated Tanner Graphs with no Loops

First, we consider loop-free Tanner graphs, and determine the RG transformations 
that are sufficient to give exact failure rates for such error-correcting codes. Then, 
we extend the RG transformations in order to obtain good approximate results for 
Tanner graphs with loops. 

We always initialize our decorated Tanner graph such that all b_i = x, p_ia = x, and
all q_ai = 0. We are interested in the decoding failure rate b_i at the specified target
variable node i. Our method 900 obtains b_i by repeatedly renormalizing nodes, one
node at a time, as described above, other than the target variable node i itself.

The first possibility is to renormalize the farthest "leaf" variable node, denoted
here by i, which is connected to a single check node a. Clearly, when that leaf node
"vanishes," p_ia and q_ai are also discarded. We also renormalize all the q_aj variables
leading out of the check node a to other variable nodes j. Our formulation for this
renormalization is:

    q_{aj} \leftarrow 1 - (1 - q_{aj})(1 - p_{ia}),    (37)

where the left arrow indicates that we replace the old value of q_aj with the new
value. Notice that each renormalization of q_aj increases its value.

When we renormalize a "leaf" check node a that is only connected to a single
variable node i, we adjust the values of all the p_ib variables leading to other check
nodes b to which the variable node i is attached. The renormalization group
transformation is

    p_{ib} \leftarrow p_{ib} q_{ai}.    (38)

Notice that each renormalization of p_ib decreases its value. At the same time, we
should also renormalize b_i as follows:

    b_i \leftarrow b_i q_{ai}.    (39)

The renormalization of a variable node that is not a leaf node is described in greater
detail below for loopy graphs.

When only the "target" node i remains, we use the current value of b_i as our
RG prediction of the average failure rate. As stated above, these steps can be
repeated for any number of target nodes.

Example 



To better describe the RG method according to our invention, we give the
following simple example. Recall the error-correcting code defined by the parity
check matrix

    A = \begin{pmatrix}
        1 & 1 & 0 & 0 \\
        0 & 1 & 1 & 1
        \end{pmatrix}    (40)

We desire to determine the decoding failure rate b_2 at the second variable node. We
initialize p_11 = p_21 = p_22 = p_32 = p_42 = x, q_11 = q_12 = q_22 = q_23 = q_24 = 0,
and b_2 = x.

As shown in Figure 3(a), we decorate the Tanner graph for this code. All of the
variable nodes other than variable node 2 are leaf nodes, so we can renormalize
any of them. According to our general method 900, we renormalize the node
farthest from node 2, breaking ties randomly. If we select variable node 4, then we
discard p_42 and q_24 and obtain new values q_22 = x and q_23 = x using equation (37).

The new reduced size decorated Tanner graph is shown in Figure 3(b). Next, we
renormalize variable node 3. We discard p_32 and q_23, and renormalize
q_22 to the value 1 - (1 - x)^2 = 2x - x^2. The even smaller decorated Tanner graph is
shown in Figure 3(c). Next we renormalize variable node 1. We discard p_11
and q_11 and obtain the renormalized value q_12 = x. The Tanner graph is
shown in Figure 3(d). Next we renormalize check node 2. We discard
p_22 and q_22 and obtain p_21 = b_2 = 2x^2 - x^3, as shown for the Tanner graph in Figure
3(e). Finally we renormalize check node 1. This leaves us with only a single
variable node, our original target node 2, and b_2 gets renormalized to its correct
failure rate, b_2 = 2x^3 - x^4, as described above.
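
The leaf renormalizations used in this example can be sketched in a few lines. The sketch is an illustration only and handles loop-free graphs exclusively: the function name rg_failure_rate_tree and the dictionary representation (check label mapped to the set of attached variable labels) are assumptions, and for brevity leaves are taken in sorted order rather than farthest-first, which gives the same final value for the example above.

```python
def rg_failure_rate_tree(checks, target, x):
    """Renormalize leaves of a loop-free graph until only the target is left."""
    checks = {a: set(vs) for a, vs in checks.items()}   # private copy
    variables = {i for vs in checks.values() for i in vs}
    p = {(i, a): x for a, vs in checks.items() for i in vs}
    q = {(a, i): 0.0 for (i, a) in p}
    b = {i: x for i in variables}
    while checks:
        # A leaf variable node (not the target) sits in exactly one check.
        leaf_var = next((i for i in sorted(variables - {target})
                         if sum(i in vs for vs in checks.values()) == 1), None)
        if leaf_var is not None:
            a = next(c for c, vs in checks.items() if leaf_var in vs)
            p_ia = p.pop((leaf_var, a))
            q.pop((a, leaf_var))
            checks[a].discard(leaf_var)
            variables.discard(leaf_var)
            for j in checks[a]:
                # Leaf variable rule: q_aj <- 1 - (1 - q_aj)(1 - p_ia).
                q[(a, j)] = 1 - (1 - q[(a, j)]) * (1 - p_ia)
            if not checks[a]:
                del checks[a]
            continue
        # Otherwise take a leaf check node, attached to a single variable.
        leaf_check = next(a for a, vs in checks.items() if len(vs) == 1)
        (i,) = checks.pop(leaf_check)
        q_ai = q.pop((leaf_check, i))
        p.pop((i, leaf_check))
        for other in checks:
            if i in checks[other]:
                p[(i, other)] *= q_ai      # leaf check rule for p_ib
        b[i] *= q_ai                       # and for b_i
    return b[target]


x = 0.3
graph = {1: {1, 2}, 2: {2, 3, 4}}   # the example's two checks
print(rg_failure_rate_tree(graph, 2, x), 2 * x**3 - x**4)
```

Choosing variable node 4 as the target instead gives x^2 + x^3 - x^4, the value found earlier for the last bit by direct enumeration.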



This example makes it clear why the RG method is exact for a code defined
on a graph without loops. The RG transformations essentially reconstruct the 
density evolution equations of the prior art, and we know that density evolution is 
exact for such an error-correcting code. The advantage of our RG method is that it 
gives a much better approximation for error-correcting codes represented by 
bipartite graphs with loops. It is well understood, that good error-correcting codes, 
even of moderate size, will always have loops, mainly because loops provide 
redundancy without substantially increasing the size of the code. 

The RG Method for a Graph with Loops 

For an error-correcting code represented by a graph with loops, we eventually have
to renormalize 940 a node that is not a "leaf" node. Note that we never have to
renormalize a non-leaf check node. To do this, we first collect all the check nodes
a, b, etc. connected to the variable node i that is being renormalized. We discard
q_ai, q_bi, p_ia, p_ib, etc. For any given check node attached to node i, e.g., check
node a, we also collect all the other variable nodes j attached to node a, and
renormalize the values of q_aj.

The renormalization of the q_aj variable can be done to varying degrees of accuracy.
The simplest method uses equation (37) directly. The problem with this method is
that the value of p_ia is always an over-estimate. Recall that p_ia decreases with every
renormalization. Because we are renormalizing the ith node before it has become a
leaf node, p_ia has not yet been fully renormalized, and is thus over-estimated.



Instead of using p_ia directly, we could use the value that it would have after we
renormalized all the check nodes connected to it. That is, we could replace p_ia in
equation (37) with an effective p_ia^eff given by

    p_{ia}^{eff} = p_{ia} \prod_{b \in N(i) \setminus a} q_{bi}.    (41)

On the other hand, we know that the values of the q_bi are under-estimates because
they have not yet been fully renormalized either, so p_ia^eff also is an under-estimate.
We could attempt to correct this mistake by going another level further. Before we
estimate p_ia^eff, we first re-estimate the q_bi which feed into it. Thus, we replace the
p_ia in equation (37) with an effective p_ia^eff given by

    p_{ia}^{eff} = p_{ia} \prod_{b \in N(i) \setminus a} q_{bi}^{eff},    (42)

where q_bi^eff is in turn given by

    q_{bi}^{eff} = 1 - (1 - q_{bi}) \prod_{k \in N(b) \setminus i} (1 - p_{kb}).    (43)

Putting all these together, we finally get the RG transformation

    q_{aj} \leftarrow 1 - (1 - q_{aj}) \left( 1 - p_{ia} \prod_{b \in N(i) \setminus a} \left[ 1 - (1 - q_{bi}) \prod_{k \in N(b) \setminus i} (1 - p_{kb}) \right] \right).    (44)

The RG transformation of equation (44) is worth describing in greater detail. 

In Figure 4, we show the Tanner graph where variable node i is attached to three
check nodes a, b, and c, and check node a is attached to a variable node j. Check
nodes b and c are connected to variable nodes labeled k, l, m, and n. We would like
to know the probability q_aj that check node a sends variable node j an erasure
message, taking into account the information that flows through variable node i.

We already have some previously accumulated probability q_aj that check node a
sends variable node j an erasure message because of other nodes previously
attached to check node a that have already been renormalized.

The new probability of an erasure message can be determined from a logical
argument:

"m_aj is an erasure if it was already or if m_ia is an erasure
and
(m_bi or m_kb or m_lb are erasures) and (m_ci or m_mc or m_nc are erasures)."

Converting such a logical argument into an equation for probabilities is 
straightforward. If we have "m_1 and m_2" for two messages in a logical argument, 
then we translate these terms to (p_1 p_2) for the corresponding probabilities, while 
"m_1 or m_2" translates to (1 - (1 - p_1)(1 - p_2)). Converting our full logical argument 
for Figure 4 into an equation for probabilities enables us to recover an example of 
the RG transformation of equation (44). 
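
As a minimal sketch of this translation, the Figure 4 argument can be evaluated directly in Python. The helper names p_and, p_or, and figure4_update are hypothetical, and the messages are assumed to be independent, as the product rules above imply. 

```python
from math import prod

def p_and(*ps):
    """Probability that all of several independent messages are erasures."""
    return prod(ps)

def p_or(*ps):
    """Probability that at least one of several independent messages is an erasure."""
    return 1.0 - prod(1.0 - p for p in ps)

def figure4_update(q_aj, p_ia, q_bi, p_kb, p_lb, q_ci, p_mc, p_nc):
    """New q_aj for the Figure 4 neighborhood, built from the logical argument."""
    through_i = p_and(p_ia,
                      p_or(q_bi, p_kb, p_lb),
                      p_or(q_ci, p_mc, p_nc))
    return p_or(q_aj, through_i)

# All p values set to 0.5 and all q values to 0 (illustrative numbers only).
new_q_aj = figure4_update(0.0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5, 0.5)
print(new_q_aj)
```

Writing the update as nested p_or and p_and calls keeps the code in one-to-one correspondence with the quoted logical argument, which makes it easier to adapt when the neighborhood changes. 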

The RG transformation for q_aj always corresponds to the logic of the local 
neighborhood around the variable node i that we are renormalizing. In fact, the RG 
transformation given in equation (44) is appropriate if the local neighborhood of 
node i is tree-like, but should be adjusted if there are loops in the local 
neighborhood. 

For example, the graph in Figure 5 shows a case where a variable node k is 
attached to two check nodes b and c, which in turn are each attached to variable 
node i that is to be renormalized. Note that before check nodes b or c are 
renormalized, the probabilities p_kb and p_kc that variable node k sends out an erasure 
must be identical, because all renormalizations of p_kb and p_kc happen in tandem. 

Our logic argument for whether check node a sends variable node j an erasure 
message is thus: 

"m_aj is an erasure if it was already, or if ((m_ia is an erasure) 
and 
((m_kb is an erasure) or (m_bi and m_ci are erasures)))." 

At this stage in the renormalization process, if m_kb is an erasure, then m_kc must be 
as well. Converting our logic argument into an RG transformation, we get 

    q_{aj} \leftarrow 1 - (1 - q_{aj}) (1 - p_{ia} (1 - (1 - p_{kb})(1 - q_{bi} q_{ci}))).    (45)

The step for renormalizing a non-leaf variable node i can be made increasingly 
accurate by increasing the size of the neighborhood around the node that is 
treated correctly. Naturally, as we increase the size of the neighborhood, we must 
pay for the increased accuracy with greater computation. 
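
To make the loop correction of equation (45) concrete, the following Python sketch compares it with what equation (44) would give if the Figure 5 neighborhood were wrongly treated as tree-like. The function names are hypothetical and the numerical values are illustrative only. 

```python
def loop_update(q_aj, p_ia, p_kb, q_bi, q_ci):
    """Equation (45): node k feeds both b and c, so its erasure is counted once."""
    return 1.0 - (1.0 - q_aj) * (1.0 - p_ia * (1.0 - (1.0 - p_kb) * (1.0 - q_bi * q_ci)))

def tree_update(q_aj, p_ia, p_kb, p_kc, q_bi, q_ci):
    """Equation (44) applied as if checks b and c had independent neighbors."""
    q_b_eff = 1.0 - (1.0 - q_bi) * (1.0 - p_kb)
    q_c_eff = 1.0 - (1.0 - q_ci) * (1.0 - p_kc)
    return 1.0 - (1.0 - q_aj) * (1.0 - p_ia * q_b_eff * q_c_eff)

with_loop = loop_update(0.0, 0.5, 0.5, 0.0, 0.0)
as_tree = tree_update(0.0, 0.5, 0.5, 0.5, 0.0, 0.0)
print(with_loop, as_tree)
```

With these inputs the tree-like formula gives a smaller erasure probability than the loop-aware one, because it treats the single erasure event at node k as two independent events. 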
Figure 10 summarizes a method for renormalizing a variable node that is not a leaf 
node, i.e., step 940 of Figure 9. 

Step 1010 enumerates all check nodes a, b, etc. that are connected to the variable 
node i to be renormalized and discards all the probability variables p_ia, p_ib, q_ai, q_bi, 
etc. between these check nodes and the variable node. 

For each variable node j attached to one of the neighboring check nodes a, 
renormalize the value of q_aj according to the following sub-steps: 

Step 1020 finds all check nodes and variable nodes in a local neighborhood up to a 
predetermined distance from the variable node i to be renormalized. The distances 
are measured as described above. 

Step 1030 uses a logical argument to determine which combinations of erasure messages 
cause the message from check node a to variable node j to also be an erasure. 

Step 1040 translates the logical argument into an RG transformation for q_aj as described 
above. 
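
The steps of Figure 10 can be sketched in Python as follows. This is an illustration under stated assumptions, not the method as claimed: the neighborhood is fixed to the tree-like case of equation (44), the graph is held in plain dictionaries, and the names renormalize_variable, var_nbrs, chk_nbrs, and build_figure4 are hypothetical. The edge variables of node i are read before they are discarded. 

```python
from math import prod

def renormalize_variable(i, var_nbrs, chk_nbrs, p, q):
    """Renormalize non-leaf variable node i, assuming a tree-like neighborhood.

    var_nbrs[i] is the set of checks attached to variable i, chk_nbrs[a] the
    set of variables attached to check a; p[i, a] and q[a, i] decorate edges.
    """
    checks = sorted(var_nbrs[i])                       # step 1010: enumerate
    # Steps 1020-1040, fixed here to the neighborhood of equation (44):
    # re-estimate each q_bi from the other variables attached to check b.
    q_eff = {}
    for b in checks:
        others = prod(1.0 - p[k, b] for k in chk_nbrs[b] if k != i)
        q_eff[b] = 1.0 - (1.0 - q[b, i]) * others
    for a in checks:
        p_eff = p[i, a] * prod(q_eff[b] for b in checks if b != a)
        for j in chk_nbrs[a]:
            if j != i:
                q[a, j] = 1.0 - (1.0 - q[a, j]) * (1.0 - p_eff)
    # Step 1010, second half: discard the variables on the edges of node i.
    for a in checks:
        del p[i, a], q[a, i]
        chk_nbrs[a].discard(i)
    del var_nbrs[i]

def build_figure4(p0=0.5, q0=0.0):
    """Figure 4 topology: i-{a,b,c}, a-{j}, b-{k,l}, c-{m,n}."""
    chk_nbrs = {"a": {"i", "j"}, "b": {"i", "k", "l"}, "c": {"i", "m", "n"}}
    var_nbrs, p, q = {}, {}, {}
    for a, variables in chk_nbrs.items():
        for v in variables:
            var_nbrs.setdefault(v, set()).add(a)
            p[v, a] = p0
            q[a, v] = q0
    return var_nbrs, chk_nbrs, p, q

var_nbrs, chk_nbrs, p, q = build_figure4()
renormalize_variable("i", var_nbrs, chk_nbrs, p, q)
print(q["a", "j"], q["b", "k"])
```

All effective q values are computed before any q_aj is overwritten, so the result does not depend on the order in which the neighboring checks are visited. 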
Finishing the RG Transformation Exactly 

As stated above, our RG method can always renormalize nodes until just the 
"target" node i is left, and then determine the decoding failure rate b_i. On the other 
hand, we can also renormalize a sufficient number of nodes and then make an exact 
determination. 

For the purposes of describing the exact determination, we instead represent the 
error-correcting code by a Tanner graph of N nodes, and associate an erasure 
probability x_i with each node i of the graph. This "erasure graph" is different from 
the decorated Tanner graphs 301-305 described above. The decoding failure rate 
can be determined exactly for an erasure graph, but the exact computation is only 
practical if the number of nodes in the erasure graph is small. We 
describe how to determine the decoding failure rate exactly for an erasure 
graph, and then describe how to convert a decorated graph according to 
the invention into an equivalent erasure graph. 

To determine exactly the decoding failure rate of a given node i, we generate all 2^N 
possible received messages, ranging from the all-zeros message to the all-erasures 
message, and decode each message using the BP decoder. 

Each message has a probability 

    p = \prod_{i\ \text{erased}} x_i \prod_{i\ \text{not erased}} (1 - x_i)    (46)

where the first product is over all nodes that are erased and the second product is 
over all nodes that are not erased. We determine b_i by taking the weighted average 
over all possible received messages of the probability that node i decodes to an 
erasure. Because the complexity of the exact calculation is O(2^N), we restrict 
ourselves to a small N, but nevertheless we can gain some accuracy by switching to 
an exact calculation after renormalizing sufficient nodes. 
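
A minimal Python sketch of this exact calculation follows. It assumes that the erasure graph is given as a list of checks, each a list of variable indices, and that BP decoding on the erasure channel amounts to repeatedly filling in any check that has exactly one erased variable. The function names are hypothetical. 

```python
from itertools import product

def bp_erasure_decode(checks, erased):
    """Return the set of variables still erased after iterative BP decoding."""
    erased = set(erased)
    progress = True
    while progress:
        progress = False
        for chk in checks:
            unknown = [v for v in chk if v in erased]
            if len(unknown) == 1:            # the check determines this variable
                erased.discard(unknown[0])
                progress = True
    return erased

def exact_failure_rates(checks, x):
    """Decoding failure rate of every node by enumerating all 2^N patterns."""
    n = len(x)
    b = [0.0] * n
    for pattern in product((False, True), repeat=n):
        prob = 1.0                            # weight of equation (46)
        for x_i, is_erased in zip(x, pattern):
            prob *= x_i if is_erased else 1.0 - x_i
        start = [v for v in range(n) if pattern[v]]
        for i in bp_erasure_decode(checks, start):
            b[i] += prob
    return b

# One parity check over three variables, each erased with probability 0.1.
rates = exact_failure_rates([[0, 1, 2]], [0.1, 0.1, 0.1])
print(rates)
```

In this small example a node fails only if it is erased together with at least one of the other two nodes, so each rate equals 0.1 * (1 - 0.9^2). 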

The one subtlety in the exact final calculation is that we now represent the 
error-correcting code by a Tanner graph and associated erasure probabilities x_i at each 
variable node i. In contrast, the general RG method uses a decorated Tanner 
graph. Fortunately, it is possible to convert a decorated Tanner graph into an 
erasure graph. Note that at each step of the RG method, all the probabilities q_ai 
leading out of the check node a are equal, i.e., q_ai = q_a, and all the probabilities p_ia 
leading out of the variable node i are equal, i.e., p_ia = p_i. 

We can set all the q_a probabilities equal to zero by adding a new variable node k, 
with probability p_ka = q_a, to node a in the erasure graph. We are then left with 
a decorated Tanner graph such that all q probabilities are zero and all p_ia 
probabilities coming out of each variable node are equal to p_i. We interpret the p_i as 
the erasure probabilities of the variable nodes. 

Figure 6 shows an example of expanding a decorated Tanner graph 601 to an 
equivalent erasure Tanner graph 602 with erasure probabilities. 
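
The expansion can be sketched in Python as follows, assuming that the decorated graph is stored as a list of checks (lists of variable indices) together with one p_i per variable node and one q_a per check node. The function name to_erasure_graph is hypothetical. 

```python
def to_erasure_graph(checks, p, q):
    """Convert a decorated graph into an erasure graph.

    Each check a with q_a > 0 receives one extra variable node whose erasure
    probability is q_a; the original variables keep p_i as their probability.
    """
    x = list(p)
    new_checks = [list(chk) for chk in checks]
    for a, q_a in enumerate(q):
        if q_a > 0.0:
            new_checks[a].append(len(x))     # index of the added variable node
            x.append(q_a)
    return new_checks, x

erasure_checks, erasure_probs = to_erasure_graph([[0, 1]], [0.2, 0.3], [0.1])
print(erasure_checks, erasure_probs)
```

The added node reproduces the effect of q_a because a check can only recover a variable when every other variable attached to it, including the added one, is not erased. 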

Extension to Generalized Parity Check Matrices 

Generalized parity check matrices define many of the modern error-correcting 
codes, such as turbo-codes, Kanter-Saad codes, and repeat-accumulate codes. In 
the generalized parity check matrix, additional columns are added to a parity check 
matrix to represent "hidden" nodes. Hidden nodes have state variables that are not 
passed to other nodes, i.e., the states of the hidden nodes are "hidden." A good 
notation for the hidden state variables is a horizontal line above the corresponding 
columns. For example, we write 

    A = \begin{pmatrix} \overline{1} & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}    (47)

to indicate a code where the first variable node is a hidden node. To indicate that a 
variable node is a hidden node in our graphical model, we use an open circle rather 
than a filled-in circle. Such a graph, which generalizes Tanner graphs, is called a 
"Wiberg graph," see N. Wiberg, "Codes and decoding on general graphs," Ph.D. 
Thesis, University of Linköping, 1996, and N. Wiberg et al., "Codes and iterative 
decoding on general graphs," Euro. Trans. Telecomm., Vol. 6, pages 513-525, 
1995. 



Figure 7 shows a Wiberg graph representing an error-correcting code defined by 
the generalized parity check matrix of equation (47). 

To generalize our RG method 900 to handle Wiberg graphs, we initialize the 
probabilities p_ia coming out of a hidden node to one, instead of to the erasure rate x 
as we do for ordinary transmitted variable nodes. This reflects the fact that hidden 
nodes are always erased, while ordinary variable nodes are erased with a 
probability of x. 
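
As a small illustration, the initialization can be written in Python as below, using the matrix of equation (47) with its first column hidden. The function name init_decorated_graph is hypothetical, and starting every q_ai at zero is an assumption of this sketch. 

```python
def init_decorated_graph(H, hidden, x):
    """Initial edge variables for a generalized parity check matrix H.

    hidden is the set of column indices of hidden nodes; x is the erasure rate.
    """
    p, q = {}, {}
    for a, row in enumerate(H):
        for i, bit in enumerate(row):
            if bit:
                p[i, a] = 1.0 if i in hidden else x   # hidden nodes are always erased
                q[a, i] = 0.0                         # assumed starting value
    return p, q

H47 = [[1, 1, 0, 1, 0, 0],
       [1, 0, 1, 0, 1, 0],
       [0, 1, 1, 0, 0, 1]]
p_init, q_init = init_decorated_graph(H47, hidden={0}, x=0.2)
print(p_init[0, 0], p_init[1, 0])
```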
Comparison with Numerical Simulations 

We now describe some experimental predictions of our RG method. We define a 
(3,5) regular Gallager error-correcting code by a parity check matrix 
with N = 60 and k = 36. That is, each of the 36 rows in the parity check matrix has 
five entries that are ones, and the rest are zeros, and each of the 60 columns has 
three entries that are ones. There are no hidden nodes. No two parity checks share 
more than one variable node. This means that all local neighborhoods of nodes are 
tree-like. Therefore, we use the RG transformation (44) whenever we renormalize 
a non-leaf variable node. We renormalize nodes until we are left with seven 
nodes, and then finish the determination exactly. 

We consider erasure rates x at intervals of 0.05 between x = 0 and x = 1. When we 
use the RG approximation, we average our decoding failure rates b_i over all nodes i 
to get an overall bit error rate. Our experiment includes a thousand trials at each 
erasure rate, decoded according to the standard BP decoding method. 

Figure 8 shows our experimental results. In Figure 8, the x-axis is the erasure rate, 
and the y-axis is the bit error rate. Curve 811 is the prior art density evolution 
prediction, curve 812 is the RG theoretical prediction, and the open circles 813 are our 
experimental results. Figure 8 clearly shows that the density-evolution prediction 
has a threshold-like behavior and is completely incorrect for small or medium 
block-lengths, whereas the RG method according to the invention is not. 
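
The threshold-like behavior of the density evolution prediction can be reproduced with a short Python sketch. It uses the standard density evolution recursion for a regular (d_v, d_c) ensemble on the BEC, which is assumed here to be the infinite block-length recursion that curve 811 represents. The function name and iteration count are illustrative. 

```python
def density_evolution_ber(x, dv=3, dc=5, iterations=2000):
    """Infinite block-length bit erasure rate for a regular (dv, dc) code on the BEC."""
    p = x                                     # variable-to-check erasure probability
    for _ in range(iterations):
        q = 1.0 - (1.0 - p) ** (dc - 1)       # check-to-variable erasure probability
        p = x * q ** (dv - 1)
    q = 1.0 - (1.0 - p) ** (dc - 1)
    return x * q ** dv                        # a bit stays erased if all dv checks fail

below = density_evolution_ber(0.40)
above = density_evolution_ber(0.60)
print(below, above)
```

The prediction collapses to zero at the lower erasure rate and stays large at the higher one, with nothing in between, which is the step-like shape that a finite code of 60 bits does not follow. 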
Extension to a Gaussian Noise Channel 

We extend our RG method so that it can also be used with an additive white 
Gaussian noise (AWGN) channel. We do this by adapting a Gaussian 
approximation to density evolution for the AWGN channel as described by Chung 
et al. in "Analysis of Sum-Product Decoding of Low-Density Parity-Check Codes 
Using a Gaussian Approximation," IEEE Trans. Info. Theory, Vol. 47, No. 2, pages 
657-670, 2001. We first describe that approximation. 

In the AWGN channel, there are only two possible inputs, 0 and 1, but the output 
can be any real number. If x is the input, then the output is y = (-1)^x + z, where z is a 
Gaussian random variable with zero mean and variance σ^2. For each received bit i 
in the code, the log-likelihood ratio m_i^0 = ln(p(y_i | x_i = 0) / p(y_i | x_i = 1)) measures the 
relative likelihood that the transmitted bit i was a zero, given that the received 
real number is y_i. 
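
For this channel model the log-likelihood ratio has a simple closed form. The Python sketch below evaluates it from the two Gaussian densities; the function name channel_llr is hypothetical. 

```python
def channel_llr(y, sigma):
    """ln(p(y | x = 0) / p(y | x = 1)) for y = (-1)^x + z with noise std sigma."""
    log_p0 = -((y - 1.0) ** 2) / (2.0 * sigma ** 2)   # x = 0 is sent as +1
    log_p1 = -((y + 1.0) ** 2) / (2.0 * sigma ** 2)   # x = 1 is sent as -1
    return log_p0 - log_p1                            # normalizing constants cancel

llr = channel_llr(0.3, 0.8)
print(llr)
```

Expanding the two squares shows that the result equals 2y/σ^2, so a positive received value favors a zero and a larger noise variance weakens the evidence. 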

The error-correcting code is defined by a generalized parity check matrix, as 
described above. The all-zeros codeword is transmitted, and the decoding 
process is the sum-product belief propagation process. In this decoding process, 
real-valued messages are iteratively solved as functions of each other. The types of 
real-valued messages which are used are m_ia from variable nodes i to check nodes 
a, and m_ai from check nodes a to variable nodes i. 

The messages m_ia are log-likelihood ratios by which the variable node i informs the 
check node a of its probability of being either a zero or a one. For example, 
m_ia → ∞ means that node i is certain it is a zero, while m_ia = 1 means that variable 
node i is signaling check node a that ln(p(x_i = 0)/p(x_i = 1)) = 1. The messages m_ai 
are log-likelihood ratios interpreted as information from the check node a to the 
variable node i about the state of variable node i. 



In sum-product decoding, the messages are iteratively solved according to the 
update rules: 

    m_{ia} = m_i^0 + \sum_{b \in N(i) \setminus a} m_{bi},    (48)

where, if node i is a hidden node, the m_i^0 is omitted, and 

    \tanh(m_{ai}/2) = \prod_{j \in N(a) \setminus i} \tanh(m_{ja}/2).    (49)
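
The two update rules can be sketched in Python as follows. The function names are hypothetical, and the clipping before the inverse hyperbolic tangent is a numerical safeguard added for this sketch, not part of the rules themselves. 

```python
import math

def variable_to_check(m0, other_check_msgs, hidden=False):
    """Equation (48): channel term plus messages from the other checks."""
    base = 0.0 if hidden else m0              # hidden nodes have no channel term
    return base + sum(other_check_msgs)

def check_to_variable(other_variable_msgs):
    """Equation (49): combine messages from the other variables of the check."""
    t = math.prod(math.tanh(m / 2.0) for m in other_variable_msgs)
    t = max(-1.0 + 1e-15, min(1.0 - 1e-15, t))   # keep atanh finite
    return 2.0 * math.atanh(t)

m_ia = variable_to_check(0.5, [1.0, -0.2])
m_ai = check_to_variable([1.0, 1.0])
print(m_ia, m_ai)
```

The check output is never stronger than its weakest input, which reflects that a parity check is only as reliable as the least certain of the other bits it involves. 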



In the density evolution method for the AWGN channel, one considers the 
probability distributions p(m_ia) and p(m_ai) for the messages, where the probability 
distribution is an average over all possible received blocks. A distribution f(x) is 
called consistent if f(x) = f(-x) e^x for all x. The consistency condition is preserved 
for the message probability distributions for all messages under sum-product 
decoding. 



If the probability distributions p(m_ia) and p(m_ai) are approximated as Gaussian 
distributions, then the consistency condition means that the means μ of these 
distributions are related to the variances σ^2 by σ^2 = 2μ. This means that the 
message probability distributions can be characterized by a single parameter: their 
mean. 
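
The relation between mean and variance follows in one step from the consistency condition. As a short worked derivation, for a Gaussian density f with mean μ and variance σ^2: 

```latex
\begin{align*}
f(x) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \\
\frac{f(x)}{f(-x)} &= \exp\!\left(\frac{(x+\mu)^2-(x-\mu)^2}{2\sigma^2}\right)
                    = \exp\!\left(\frac{2\mu x}{\sigma^2}\right), \\
\frac{f(x)}{f(-x)} &= e^{x} \ \text{for all } x
  \quad\Longleftrightarrow\quad \frac{2\mu}{\sigma^2} = 1
  \quad\Longleftrightarrow\quad \sigma^2 = 2\mu .
\end{align*}
```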
Thus, by making the approximation that the message probability distributions are 
Gaussians, the density evolution equations for the AWGN channel can be reduced 
to self-consistent equations for the means v_ia of the probability distributions of 
messages from variable nodes i to check nodes a, and the means u_ai of the 
probability distributions of messages from check nodes a to variable nodes i. These 
equations are 

    v_{ia} = u^0 + \sum_{b \in N(i) \setminus a} u_{bi},    (50)

where u^0 is the mean value of m_i^0, and is omitted for hidden nodes, and 

    u_{ai} = \phi^{-1}\left( 1 - \prod_{j \in N(a) \setminus i} (1 - \phi(v_{ja})) \right),    (51)

where \phi(x) is a function defined by 

    \phi(x) = 1 - \frac{1}{\sqrt{4 \pi x}} \int_{-\infty}^{\infty} \tanh\left(\frac{u}{2}\right) e^{-(u - x)^2 / (4x)} \, du.    (52)
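
The function φ has no closed form, but it is easy to evaluate numerically. The Python sketch below is one possible way to do so, assuming trapezoidal integration over ten standard deviations and an inverse obtained by bisection. The function names, the integration width, and the tolerances are choices made for this illustration. 

```python
import math

def phi(x, n=2000):
    """phi(x) of equation (52) by trapezoidal integration; phi(0) = 1."""
    if x <= 0.0:
        return 1.0
    if math.isinf(x):
        return 0.0
    half_width = 10.0 * math.sqrt(2.0 * x)    # ten standard deviations
    lo = x - half_width
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        u = lo + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.tanh(u / 2.0) * math.exp(-(u - x) ** 2 / (4.0 * x))
    return 1.0 - total * step / math.sqrt(4.0 * math.pi * x)

def phi_inv(y, tol=1e-9):
    """Inverse of phi by bisection, using that phi decreases from 1 to 0."""
    if y >= 1.0:
        return 0.0
    if y <= 0.0:
        return math.inf
    lo, hi = 0.0, 1.0
    while phi(hi) > y:                        # grow the bracket until it contains y
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(phi(1.0), phi_inv(phi(2.0)))
```

Bisection is slow compared with a tabulated or fitted inverse, but it needs nothing beyond φ itself and is adequate for the small graphs considered here. 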



RG Transformations for the AWGN Channel 

The density evolution equations (50) and (51) for the AWGN channel under the 
Gaussian approximation are analogs of the density evolution equations (4) and (3) 
for the BEC channel. Our RG procedure for the AWGN channel is substantially 
the same as for the BEC channel. The main difference is that we change the RG 
transformations. 

Just as before, we construct a set of RG transformations which exactly reproduce 
the density evolution equations for a tree-like graph. We generate a decorated 
Tanner/Wiberg graph for the code by keeping u_ai and v_ia variables between each 
pair of connected nodes. The u_ai variables are initialized to infinity, while the v_ia 
variables are initialized to u^0, unless the i-th node is a hidden node, in which case 
the v_ia are initialized to zero. We also introduce the variables h_i, analogous to b_i in 
the BEC, which are initialized like the v_ia variables. 

If we renormalize a leaf check node a connected to a variable node i, then we find 
the other check nodes b attached to i and apply the RG transformations 

    v_{ib} \leftarrow v_{ib} + u_{ai},    (53)

and 

    h_i \leftarrow h_i + u_{ai}.    (54)

If we renormalize a leaf variable node i connected to a check node a, we find the 
other variable nodes j attached to check node a and apply the RG transformation 

    u_{aj} \leftarrow \phi^{-1}(1 - (1 - \phi(u_{aj}))(1 - \phi(v_{ia}))).    (55)

Note that with each renormalization of v_ib, the magnitude of v_ib increases, while 
with each renormalization of u_aj, the magnitude of u_aj decreases. 

When we renormalize a non-leaf variable node i which is connected to check nodes 
a, b, etc., we renormalize the variables like u_aj, where j is another variable node 
connected to check node a. Just as for the BEC, we consider a local neighborhood 
of nodes around the node i. For example, if the neighborhood of check nodes 
connected to i and other variable nodes connected to those check nodes is tree-like, 
we use the RG transformation 

    u_{aj} \leftarrow \phi^{-1}(1 - (1 - \phi(u_{aj}))(1 - \phi(v_{ia}^{eff}))),    (56)

where 

    v_{ia}^{eff} = v_{ia} + \sum_{b \in N(i) \setminus a} \phi^{-1}\left( 1 - (1 - \phi(u_{bi})) \prod_{k \in N(b) \setminus i} (1 - \phi(v_{kb})) \right).    (57)
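
The AWGN update for a tree-like neighborhood can be sketched in Python as follows. The sketch assumes that the effective mean v_ia^eff is built by analogy with equations (42) and (43): v_ia plus, for every other check b attached to i, the value of φ^-1 applied to 1 - (1 - φ(u_bi)) times the product of (1 - φ(v_kb)) over the other variables k of b. The numerical φ and its bisection inverse, and all function names, are choices made for this illustration. 

```python
import math

def phi(x, n=1000):
    """Numerical phi(x): one minus the mean of tanh(U/2) for U ~ N(x, 2x)."""
    if x <= 0.0:
        return 1.0
    if math.isinf(x):
        return 0.0
    w = 10.0 * math.sqrt(2.0 * x)
    step = 2.0 * w / n
    total = sum(math.tanh(u / 2.0) * math.exp(-(u - x) ** 2 / (4.0 * x))
                for u in (x - w + k * step for k in range(n + 1)))
    return 1.0 - total * step / math.sqrt(4.0 * math.pi * x)

def phi_inv(y, tol=1e-9):
    """Inverse of phi by bisection (phi decreases from 1 to 0)."""
    if y >= 1.0:
        return 0.0
    if y <= 0.0:
        return math.inf
    lo, hi = 0.0, 1.0
    while phi(hi) > y:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def effective_mean(v_ia, branches):
    """Assumed form of v_ia^eff; branches holds (u_bi, [v_kb, ...]) per other check b."""
    total = v_ia
    for u_bi, v_kbs in branches:
        keep = (1.0 - phi(u_bi)) * math.prod(1.0 - phi(v) for v in v_kbs)
        total += phi_inv(1.0 - keep)
    return total

def rg_update(u_aj, v_eff):
    """Renormalized u_aj given the (effective) mean arriving from node i."""
    return phi_inv(1.0 - (1.0 - phi(u_aj)) * (1.0 - phi(v_eff)))

v_eff = effective_mean(1.0, [(math.inf, [2.0])])
print(v_eff, rg_update(math.inf, v_eff), rg_update(2.0, v_eff))
```

With no other checks attached to i, effective_mean returns v_ia unchanged and rg_update reduces to the leaf-variable transformation. Starting from u_aj at infinity, the first update simply copies the effective mean, and later updates can only lower u_aj. 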




The RG method proceeds as in the BEC case, until the determination of the bit 
error rate. For the AWGN channel, it is normally inconvenient to stop the RG 
method before renormalizing all the way down to the "target" node, because it is 
not simple to do an exact computation even with just a few nodes in the graph. 

When we have renormalized all but our target node i, we are left with a final 
renormalized value of h_i. The Gaussian approximation tells us that the probability 
distribution for the node i being decoded as a zero is a Gaussian with mean h_i and 
variance 2h_i. Decoding failures correspond to those parts of the probability 
distribution which are below zero. Thus, our prediction for the bit error rate (ber_i) 
at node i is 

    ber_i = \frac{1}{\sqrt{4 \pi h_i}} \int_{-\infty}^{0} \exp\left( -\frac{(x - h_i)^2}{4 h_i} \right) dx.    (58)

Generating Error-Correcting Codes 

Given that the density evolution method has been used as a guide to generate the 
best-known practical error-correcting codes, we can generate even better codes 
using our RG method. With the RG method according to our invention, we can 
input a code defined by an arbitrary generalized parity check matrix, and obtain as 
output a prediction of the bit error rate at each node. 
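
The per-node output for the AWGN channel is the lower tail of a Gaussian with mean h_i and variance 2h_i, which can be written with the complementary error function. The Python sketch below evaluates it; the function name predicted_ber is hypothetical. 

```python
import math

def predicted_ber(h):
    """Probability that a Gaussian with mean h and variance 2h falls below zero."""
    if h < 0.0:
        raise ValueError("h must be non-negative")
    # P(X < 0) = 0.5 * erfc(h / (sqrt(2) * sqrt(2h))) = 0.5 * erfc(sqrt(h) / 2)
    return 0.5 * math.erfc(math.sqrt(h) / 2.0)

print(predicted_ber(0.0), predicted_ber(4.0), predicted_ber(9.0))
```

A node with no accumulated evidence (h = 0) is wrong half the time, and the predicted bit error rate falls monotonically as h grows. 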
We can use this output in an objective function for a guided search through the 
space of possible improved codes. For example, we can try to find an N = 100 
blocklength, transmission rate 1/2 code with no hidden states that achieves a bit 
error rate of less than 10^-4 at the smallest possible signal-to-noise ratio for the 
AWGN channel. We do this by iteratively evaluating codes of the correct 
blocklength and rate, using our RG method, and using any known search 
technique, e.g., greedy descent, simulated annealing, genetic process, etc., to 
search through the space of valid parity check matrices. 
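
A greedy descent of this kind can be sketched in Python as follows. Everything here is illustrative: the move (relocating a single one within a row) preserves row weights only and does not check the code rate, and the objective shared_pairs is merely a placeholder standing in for an RG-predicted bit error rate. All function names are hypothetical. 

```python
import random

def propose(H, rng):
    """Move one 1 within a random row, which keeps every row weight fixed."""
    cand = [row[:] for row in H]
    row = rng.choice(cand)
    ones = [c for c, bit in enumerate(row) if bit]
    zeros = [c for c, bit in enumerate(row) if not bit]
    if ones and zeros:
        row[rng.choice(ones)] = 0
        row[rng.choice(zeros)] = 1
    return cand

def greedy_descent(H, objective, steps=500, seed=0):
    """Keep a candidate matrix only if it lowers the objective."""
    rng = random.Random(seed)
    best, best_score = H, objective(H)
    for _ in range(steps):
        cand = propose(best, rng)
        score = objective(cand)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score

def shared_pairs(H):
    """Placeholder objective: check pairs that share more than one variable."""
    count = 0
    for r in range(len(H)):
        for s in range(r + 1, len(H)):
            if sum(a & b for a, b in zip(H[r], H[s])) > 1:
                count += 1
    return count

start = [[1, 1, 0, 0, 0, 0],
         [1, 1, 0, 0, 0, 0],
         [0, 0, 1, 1, 0, 0]]
best, best_score = greedy_descent(start, shared_pairs)
print(best_score)
```

In an actual search the objective would be replaced by a function that runs the RG method on the candidate matrix and returns, for example, the average predicted bit error rate. 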



Because we directly focus on the correct measure of merit, i.e., the bit error rate 
itself, rather than the threshold in the infinite block-length limit, the search 
according to the invention improves on the results obtained using the prior art 
density evolution process. We can guide the search because we have information 
about the bit error rate at every node. For example, it might make sense to 
"strengthen" a weak variable node with a high bit error rate by adding additional 
parity check nodes, or we can "weaken" strong nodes with a low bit error rate by 
turning them into hidden nodes, thus increasing the transmission rate. 

On the other hand, determining the bit error rate for every node slows down a 
search. It may be worthwhile, at least for large block-lengths, to restrict 
oneself to those codes for which there are only a small number of different classes 
of nodes, defined in terms of the local neighborhoods of the nodes. Most of the 
best-known codes are of this type. Rather than determining the bit error rate for 
every variable node, we can determine the bit error rate for just one representative 
node of each class of variable nodes. 
For example, for a regular Gallager code, each node has the same local 
neighborhood, so any node can be selected as a representative of all the nodes in 
the neighborhood. The error made in this method is estimated by comparing bit 
error rates of different nodes of the same class. For actual finite-sized regular 
Gallager codes, we find that the RG method gives very similar predictions for each 
of the nodes, so that the error made by just considering a single variable node as a 
representative of all of them is quite small. 
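
One simple way to pick representatives is sketched below in Python. It assumes that a node's class is determined by its own degree together with the degrees of the checks attached to it, which is only a depth-one notion of "local neighborhood" chosen for this illustration. The function names are hypothetical. 

```python
from collections import defaultdict

def node_classes(H):
    """Group variable nodes by (own degree, sorted degrees of attached checks)."""
    n_checks, n_vars = len(H), len(H[0])
    check_degree = [sum(row) for row in H]
    classes = defaultdict(list)
    for i in range(n_vars):
        attached = [a for a in range(n_checks) if H[a][i]]
        key = (len(attached), tuple(sorted(check_degree[a] for a in attached)))
        classes[key].append(i)
    return dict(classes)

def representatives(H):
    """One variable node per class: the lowest-numbered member."""
    return sorted(members[0] for members in node_classes(H).values())

H_example = [[1, 1, 0, 1, 0, 0],
             [1, 0, 1, 0, 1, 0],
             [0, 1, 1, 0, 0, 1]]
print(node_classes(H_example), representatives(H_example))
```

For a regular Gallager code this grouping yields a single class, so the bit error rate needs to be determined for only one node. 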

Although the invention has been described by way of examples of preferred 
embodiments, it is to be understood that various other adaptations and 
modifications may be made within the spirit and scope of the invention. Therefore, 
it is the object of the appended claims to cover all such variations and 
modifications as come within the true spirit and scope of the invention. 